30 min at 37°C, either with or without preheating at 90°C for 5 min, followed
by electrophoresis in a 15% denaturing gel. 

For detection of rev-EGFP mRNA, we used a 25-mer deoxyribonucleotide
probe that was complementary to the EGFP mRNA of the rev-EGFP fusion
protein. A 29-mer deoxyribooligonucleotide probe was used
for detection of the GAPDH transcript.

HIV-1 antiviral assay. For determination of anti-HIV-1 activity of the siRNAs,
transient assays were done by cotransfection of siDNAs and infectious HIV-1
proviral DNA, pNL4-3 into 293 cells as described' s . Before transfection, the 
cells were grown for 24 h in six-well plates in 2 ml EMEM supplemented with
10% (vol/vol) FBS and 2 mM L-glutamine, and transfected using
Lipofectamine Plus reagent (Life Technologies, GibcoBRL) as described by the
manufacturer. The DNA mixtures consisting of 0.5 µg siDNAs or controls, and
0.5 µg pNL4-3 were formulated into cationic lipids and applied to the cells.
After one, two, three, and four days, supernatants were collected and analyzed
for HIV-1 p24 antigen (Beckman Coulter, Hialeah, FL). The p24 values were
calculated with the aid of the Dynatech MR5000 ELISA plate reader 
(Dynatech Labs Inc., Chantilly, VA). Cell viability was also assessed using a 
Trypan Blue dye exclusion count at four days after transfection.

Effective expression of small
interfering RNA in human 
cells 



Cynthia P. Paul 1, Paul D. Good 1, Ira Winer 2,
and David R. Engelke 1,2*

In many eukaryotes, expression of nuclear-encoded mRNA can
be strongly inhibited by the presence of a double-stranded RNA
(dsRNA) corresponding to exon sequences in the mRNA
(refs 1,2). The use of this "RNA interference" (RNAi) in mammalian
studies had lagged well behind its utility in lower animals
because uninterrupted RNA duplexes longer than 30 base pairs
trigger generalized cellular responses through activation of
dsRNA-dependent protein kinases (ref. 3). Recently it was demonstrated
(ref. 4) that RNAi can be made to work in cultured human cells by
introducing shorter, synthetic duplex RNAs (~20 base pairs)
through liposome transfection. We have explored several strategies
for expressing similar short interfering RNA (siRNA) duplexes
within cells from recombinant DNA constructs, because this
might allow long-term target-gene suppression in cells, and
potentially in whole organisms. Effective suppression of target
gene product levels is achieved by using a human U6 small
nuclear RNA (snRNA) promoter to drive nuclear expression of a
single RNA transcript. The siRNA-like part of the transcript
consists of a 19-base pair siRNA stem with the two strands joined by
a tightly structured loop and a U1-4 3' overhang at the end of the
antisense strand. The simplicity of the U6 expression cassette
and its widespread transcription in human cell types suggest that
this mode of siRNA delivery could be useful for suppressing
expression of a wide range of genes.

The U6 snRNA promoter cassettes and si-like RNA inserts are shown
in Figure 1. We previously showed that RNA expressed by RNA polymerase
III from the U6+1 or U6+27 cassettes was expressed primarily
as full-length transcripts and was located in the nucleus (refs 5,6). U6+27
transcripts, containing the first 27 nucleotides of human U6 RNA,
were capped with γ-methyl phosphates and accumulated to higher
levels than U6+1 transcripts. Cassettes are designed so that short
RNA coding sequences are inserted between unique SalI and XbaI
sites. After the XbaI site, the cassette encodes a strong stem to protect
the transcripts against 3'-5' exonuclease attack, then a poly(U) transcription
termination sequence. However, the insertion sequences
discussed later also contain their own UUUU terminator at the
3' end of the inserted sequences, terminating most transcription
before the cassette-encoded stem/terminator region.

To test whether expressed si-like RNA is effective, we targeted a
site in human lamin A/C mRNA that has been demonstrated to be
vulnerable to synthetic siRNA (ref. 4). The inserted sequences encoded
several variants of siRNA duplexes and controls, shown in Figure
1B. Previous work on synthetic anti-lamin A/C siRNA used two
independent strands with 3' unpaired tails (ref. 4). Although it would be
theoretically possible to synthesize two strands independently in
vivo, the need to anneal the two strands could make the production
of siRNA inefficient, and might complicate routine cloning and
expression of the siRNA constructs. However, as seen elsewhere in
this issue (Lee et al., p. 500), the synthesis of the siRNA as independent
strands from the U6 promoter can also be effective.

To make the siRNA duplex as one short transcript, an RNA insert
was used that contains the 19-nucleotide sense strand of the target,
followed by a UUCG tetraloop sequence (ref. 7), the antisense strand, and a
UUUU transcription terminator, in that order. This terminates a
high percentage of the transcripts exactly at the end of the siRNA
stem (ref. 5). The 3'-UUUU overhang after the siRNA is attacked by 3'
exonucleases, leaving 1 to 4 U 3'-end overhangs (ref. 5). Results with synthetic
siRNA (ref. 4) suggest that such 3' overhangs can increase efficacy.
RNA blot analysis has shown that high levels (~10^4-10^5 RNA molecules/cell)
of nearly full-length RNAs can be expressed from these
cassettes (ref. 5). The "hairpin siRNAs" give comparable expression levels,
although there is a complex pattern of breakdown products as well as
the full-length product (not shown).
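
The insert layout described above (sense strand, UUCG tetraloop, antisense strand, UUUU terminator, in that order) can be sketched as a small string-building routine. The following Python sketch is illustrative only: the function names are hypothetical, and the example 19-nucleotide sense sequence is an assumption, taken from within the hybridization probe listed in the Experimental protocol rather than from a sequence stated for the insert itself.

```python
# Illustrative sketch only; function names and the example sequence are
# assumptions and are not taken from the paper's methods.

_COMPLEMENT = str.maketrans("ACGU", "UGCA")


def reverse_complement(rna: str) -> str:
    """Return the reverse complement of an RNA sequence (A/C/G/U only)."""
    return rna.translate(_COMPLEMENT)[::-1]


def hairpin_insert(sense: str, loop: str = "UUCG", terminator: str = "UUUU") -> str:
    """Build sense + loop + antisense + terminator, in that order."""
    sense = sense.upper()
    if len(sense) != 19 or set(sense) - set("ACGU"):
        raise ValueError("expected a 19-nucleotide RNA sequence (A, C, G, U)")
    return sense + loop + reverse_complement(sense) + terminator


# Assumed example target (sense strand), used here purely for illustration.
example_sense = "CUGGACUUCCAGAAGAACA"
insert = hairpin_insert(example_sense)
print(insert)
```

The terminator is appended directly after the antisense strand, so that, as described in the text, transcripts ending there carry a short U overhang at the 3' end of the antisense strand.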

Figure 2 shows elimination of lamin A/C protein when HeLa cells
were transiently transfected with either synthetic siRNA or siRNA-expressing
clones. Cells shown in Figure 2A underwent
Oligofectamine-mediated transfection with either no RNA or a synthetic
19-base pair siRNA duplex with 3'-TT overhangs (ref. 4). Nuclei in
Figure 2A were visualized with 4',6-diamidino-2-phenylindole (DAPI)
staining (blue). As expected from previous work, the lamin A/C signal
(red) substantially disappears from most cells, presumably those that
are transfected. For testing recombinant DNAs, cells were cotransfected
with a plasmid (pCMVβ) expressing β-galactosidase (β-Gal) to mark
transfected cells. Production of β-Gal also precludes the possibility that
siRNA constructs nonspecifically obstruct protein synthesis. Figure 2B
shows individual frames with staining for lamin A/C (red), β-Gal
(green), or the overlay of the two signals. Without the siRNA inserts,
cells transfected with any of the expression cassette plasmids do not
have detectably reduced lamin A/C signal (shown only for U6+1 in Fig.
2B). When either U6+1 or U6+27 cassettes were used with anti-lamin
hairpin siRNA inserts, dramatic reductions of lamin A/C signal were
observed relative to the untransfected cells in the same fields.
Transfected cells receiving the U6+27-siRNA expression cassettes gave
the most consistent and greatest lamin A/C reductions (>90%, Table 1),
similar to synthetic siRNA (~95%). This might reflect a threshold effect
caused by lower levels of the U6+1-expressed siRNA (ref. 5).

Figure 1. Expression cassettes and small RNA inserts. (A) The two U6 snRNA
promoter (refs 12-14) expression cassettes used to express siRNAs and controls are
shown with the expected transcripts by RNA polymerase III, assuming no
UUUU terminators in the RNA insert. Cassettes had either no remaining U6
snRNA sequences (U6+1) or the first 27 nucleotides of U6 snRNA (U6+27) to
direct methylation of the 5'-γ-phosphate and stabilize the transcript (ref. 5). With the
inserts shown, most transcription terminates with the insert UUUU, but
readthrough to the cassette stem terminator also occurs. (B) Four tested anti-lamin
RNA inserts are shown. Each would begin immediately after the SalI
sequence from the cassette, and most termination occurs after the UUUU at
the insert 3' terminus (ref. 5 and data not shown).

Figure 2C shows lamin A/C-β-Gal overlay panels for control
RNAs expressed from the U6+27 cassette. Expression of only the 
sense or only the antisense strands of the siRNA in U6+27 did not 
affect lamin levels, reinforcing the notion that the observed reduc- 
tion in Figure 2B requires the duplex, a hallmark of siRNA action. 

We next tested a U6+27 hairpin siRNA construct with the order of 
the strands reversed to determine the specific need for an accessible 3' 
overhang on the antisense strand of the duplex. Some models for 
siRNA function predict that siRNA degradation of the target message is 
amplified by annealing of the antisense strand to the mRNA and extension
to a longer duplex with an RNA-dependent RNA polymerase. This
condition would indicate the need to have an accessible antisense 3' ter- 
minus so that it can be extended. Surprisingly, there was a significant 
reduction of the lamin signal with the reversed-strand construct, 
although it was not as consistent or effective as the original orientation. 
It is not clear why the reversed-strand construct causes partial reduc- 
tion of the lamin signal. It is possible that small amounts of breakdown 
products with 3'-UU overhang are created on the antisense strand of 
the reversed construct by 3' exonuclease digestion or a discrete endonu- 
clease cleavage between the strands. Alternatively, these hairpin siRNAs, 
when expressed within the cells, might not need to act exclusively 
through primer extension amplification. Although the active form of 
the nuclear-expressed RNAs will require long-term investigation, we 
recommend that siRNA transcripts have the sense strand first, followed 
by a tetraloop and antisense strand ending with a 3' overhang created 
by the poly(U) terminator. 

Previous studies of siRNA-mediated target cleavage by extracts
in vitro suggested that the 5' termini of one or both strands might
need to be phosphorylated, and that this might be needed for efficient
assembly into obligatory ribonucleoprotein complexes (refs 8,9). Results presented
here suggest that the hairpin siRNAs might not need to have
the 5' end of either strand unblocked. These hairpin siRNA observations
would be consistent with a mechanism in which the U6 transcript
containing the hairpin duplex RNA is able to assemble into any
necessary protein complexes. The moderate preference for the antisense
strand at the end agrees with the prediction that the antisense
strand is used as a primer for RNA-dependent RNA polymerase on
the message target (ref. 10), but this appears not to be essential.

Table 1. Effect of siRNA and expression cassettes on the levels
of the lamin A/C protein in transfected cell nuclei

Construct                              Percentage lamin A/C in
                                       transfected vs. nontransfected cells

pAVU6+27 No insert                     130 ± 5
Synthetic anti-lamin siRNA             5 ± 2
pAVU6+27 Anti-lamin siRNA hairpin      9 ± 5
pAVU6+27 Sense strand only             130 ± 40
pAVU6+27 Antisense strand only         130 ± 30
pAVU6+27 Reverse-strands hairpin       25 ± 14
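
The values in Table 1 express lamin A/C fluorescence in transfected cells as a percentage of the signal in nontransfected cells on the same slide. A minimal sketch of that kind of normalization follows. The per-slide intensities are made-up placeholders, the function name is hypothetical, and the text does not say how the ± values were computed, so the sample standard deviation across slides used here is an assumption.

```python
from statistics import mean, stdev


def percent_of_control(slides):
    """Summarize transfected signal as a percentage of same-slide control signal.

    slides: list of (transfected_intensity, nontransfected_intensity) pairs,
    one pair per slide. Returns (mean percentage, sample standard deviation).
    """
    if len(slides) < 2:
        raise ValueError("need at least two slides to estimate a spread")
    percents = [100.0 * transfected / control for transfected, control in slides]
    return mean(percents), stdev(percents)


# Placeholder intensities in arbitrary units; these are NOT data from the paper.
example_slides = [(8.0, 100.0), (12.0, 120.0), (9.0, 90.0)]
avg, sd = percent_of_control(example_slides)
print(f"{avg:.1f} ± {sd:.1f} % of nontransfected signal")
```

Normalizing within each slide before averaging keeps slide-to-slide differences in staining or exposure from inflating the spread.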

Another unexpected finding was that expression from the U6
snRNA promoter cassettes, which give primarily nucleoplasmic
expression (ref. 5), would succeed in inhibiting target expression when a
majority of the existing mRNA is cytoplasmic. To be certain that
the long hairpin did not cause altered localization, we carried out
in situ hybridization with fluorescent probes to the hairpin that
showed nuclear localization very similar to that seen previously
(Fig. 3). Admittedly, a small percentage of the U6-driven transcripts
might exit to the cytoplasm and be active there. These
results suggest that the U6-driven transcripts are suppressing pre-mRNAs
before nuclear exit.

Figure 3. Localization of U6+27 siRNA transcripts. Two days after
transfection with the U6+27 anti-lamin A/C cassette plasmid, cells were fixed
and stained for nuclear DNA (DAPI, blue) and probed with a Cy3-labeled
2'-O-methyl oligoribonucleotide (red) complementary to the antisense strand
of the siRNA. As expected from work with previous U6 expression
constructs (refs 5,6), the U6+27 siRNA pattern was primarily in a nuclear speckled
pattern. Nuclear and cytoplasmic background staining by the Cy3-oligonucleotide
in the absence of U6+27-siRNA ("Mock") was minimal.

Figure 2. Effects of siRNA constructs on lamin A/C levels. HeLa cells were
transfected with either synthetic siRNA or recombinant DNA cassettes
expressing different small RNAs from different RNA polymerase III
promoters. Cells were stained with DAPI (blue) or with antibodies to lamin
A/C (red) or β-Gal (green). (A) Synthetic siRNA or no RNA transfections,
showing that lamin A/C staining of the nuclear periphery is largely abolished
in most cells, with only low levels of residual red staining in nuclear interiors.
(B) Transfection with U6 promoter cassettes either without an siRNA insert
(U6+1, no insert) or containing the anti-lamin siRNA shown in Figure 1B
(U6+1 siRNA and U6+27 siRNA). Transfected cell cytoplasms are green,
whereas nuclei from untransfected cells show no green cytoplasm. Empty
expression cassettes have no apparent effect on lamin A/C levels (only
empty U6+1 is shown), while transfected cells (green) using siRNA-expressing
constructs have little remaining lamin A/C (red). (C) Overlay
panels of β-Gal and lamin A/C signal after transfection with different control
insertions shown in Figure 1B. Quantitative assessment of remaining lamin
A/C signals in transfected cells compared to untransfected cells on the same
slide is given in Table 1.

The U6 expression cassettes used in these studies are <400 base pairs 
long and should be relatively easy to incorporate into a variety of vec- 
tors. The siRNA inserts can be synthesized as complementary 
oligodeoxynucleotide pairs to rapidly create cassettes directed at multiple
sites. It should even be possible to use several cassettes per vector,
targeted at either multiple mRNAs or multiple sites on the same mes- 
sage. It is likely that the hairpin siRNA strategy will be applicable to 
many mRNA targets. Preliminary experiments targeting both an 
endogenous human splicing factor and HIV-1 reverse transcriptase 
coding region (A. Ehsani, S. Li, A. Kleihauer, and J.J. Rossi, personal 
communication) have shown the hairpin siRNA strategy to be effective. 
However, as with the synthetic siRNAs, it is sometimes necessary to test 
several target sites along an mRNA to find one that gives the strongest 
inhibition. While much remains to be learned about the mechanism by 
which these transcripts work, the results with the simple U6 cassettes 
suggest that they might be useful for diverse experimental applications. 
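
As a rough illustration of how an insert could be turned into a complementary oligodeoxynucleotide pair, the sketch below converts an RNA insert sequence into its DNA coding strand and the reverse-complement strand. It is a sketch under stated assumptions: the function names are hypothetical, the example sequence is a shortened placeholder rather than a real target, and no restriction-site overhangs or flanking cassette sequences are added, because the exact junction sequences are not given in the text.

```python
# Illustrative sketch only; names and the example sequence are assumptions.

_DNA_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def rna_to_dna(rna: str) -> str:
    """Write an RNA sequence as the equivalent DNA coding strand (U -> T)."""
    return rna.upper().replace("U", "T")


def oligo_pair(rna_insert: str):
    """Return (top, bottom) DNA strands, each written 5' to 3'.

    top encodes the RNA insert; bottom is its reverse complement.
    Cloning overhangs are deliberately omitted (not specified in the text).
    """
    top = rna_to_dna(rna_insert)
    if set(top) - set("ACGT"):
        raise ValueError("insert must contain only A, C, G, U (or T)")
    bottom = top.translate(_DNA_COMPLEMENT)[::-1]
    return top, bottom


# Placeholder hairpin-style insert: 8-nt stem, UUCG loop, UUUU terminator.
top, bottom = oligo_pair("GACUGACUUUCGAGUCAGUCUUUU")
print("top:   ", top)
print("bottom:", bottom)
```

Generating both strands from a single RNA-level description keeps the pair complementary by construction, which matters when many target sites are being tested in parallel.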



Experimental protocol 

Materials. Lipofectin, Plus reagent, and Oligofectamine were purchased from
Invitrogen (Carlsbad, CA), as were synthetic DNA oligonucleotides for
cloning and probes. Cy3-2'-O-methyl RNA oligonucleotide hybridization
probes were from Trilink (San Diego, CA). Synthetic siRNA oligonucleotides
were from Dharmacon (Lafayette, CO). Anti-lamin A/C monoclonal antibodies
were purchased from Santa Cruz Biotechnology (Santa Cruz, CA)
(sc-7292, used at 1 µg/ml); rabbit anti-β-Gal antibodies were from Molecular
Probes (Eugene, OR) (A-11132, 1 µg/ml); Oregon green 488-labeled goat
anti-rabbit secondary antibodies were from Molecular Probes (O-11038,
5 µg/ml); and cyanine-3 (Cy3)-labeled goat anti-mouse secondary antibodies
were from Amersham-Pharmacia Biotech (Piscataway, NJ) (PA 43002,
1 µg/ml). Cassettes (ref. 5) were cloned in pAV vectors, derived from pCWRSVN
(ref. 11) by placing the promoter modules between BamHI and HindIII sites,
after modifying the vector. Modifications included destruction of the BamHI
site downstream of the Neo cassette, and removal of all sites between the original
SalI and XhoI sites, inclusive, by cleavage and religation. After inserting
the cassettes, a new polylinker was created between the HindIII and SacII
sites. Sequences to be expressed were inserted as synthetic oligodeoxynucleotides
precisely between the end of the unique SalI site and the beginning
of the unique XbaI site. Recombinant constructs were sequenced.

Transfections. Transient transfections were carried out on subconfluent
HeLa cells. Synthetic RNA was transfected using Oligofectamine as
described (ref. 4). Recombinant DNA constructs were transfected using
Lipofectin with Plus reagent according to the manufacturer's instructions.
In transient transfections, cells were split after one day. Cells were fixed
and examined for lamin protein after three days, and fixed and examined 
by in situ hybridization after two days. 

Fluorescence microscopy. Transfected cells were fixed and subjected to previously
described protocols for visualizing proteins (ref. 4) with antibodies (lamin
A/C and β-Gal) or detecting small RNAs (http://singerlab.aecom.yu.edu/protocols)
by hybridizing 5'-Cy3-labeled oligos (5'-Cy3-AAACUGGACUUCCAGAAGAACACGAA,
2'-O-methyl ribonucleotides) to the fixed preparations.
Fluorescence was acquired with a Nikon Eclipse E800 (Tokyo, Japan)
with a Hamamatsu Orca II camera (Hamamatsu City, Japan). For each construct,
hundreds of cells were examined to confirm that the selected images
were representative. On multiple slides, lamin A/C fluorescence in transfected
cells was deconvoluted and quantitated using Isee software (Inovision;
Raleigh, NC) and is expressed in Table 1 as a percentage of lamin A/C signal
from nontransfected cells on the same slides. Lamin signal was consistently
higher in transfected cells than in untransfected cells on the same slide.

Using the transcriptome to
annotate the genome 

Saurabh Saha 1,2*, Andrew B. Sparks 1,3*, Carlo Rago 1,
Viatcheslav Akmaev 4, Clarence J. Wang 4,
Bert Vogelstein 1, Kenneth W. Kinzler 1*,
and Victor E. Velculescu 1*

A remaining challenge for the human genome project involves the
identification and annotation of expressed genes. The public and
private sequencing efforts have identified ~15,000 sequences that
meet stringent criteria for genes, such as correspondence with
known genes from humans or other species, and have made
another ~10,000-20,000 gene predictions of lower confidence,
supported by various types of in silico evidence, including homology
studies, domain searches, and ab initio gene predictions (refs 1,2).
These computational methods have limitations, both because
they are unable to identify a significant fraction of genes and
exons and because they are unable to provide definitive evidence
about whether a hypothetical gene is actually expressed (refs 3,4). As the
in silico approaches identified a smaller number of genes than
anticipated (refs 5-9), we wondered whether high-throughput experimental
analyses could be used to provide evidence for the expression
of hypothetical genes and to reveal previously undiscovered
genes. We describe here the development of such a method,
called long serial analysis of gene expression (LongSAGE), an
adaptation of the original SAGE approach (ref. 10), that can be used to
rapidly identify novel genes and exons.



The LongSAGE method (Fig. 1) generates 21 bp tags derived from
the 3' ends of transcripts that can rapidly be analyzed and matched
to genomic sequence data. The method is similar to the original
SAGE approach (ref. 10), but uses a different type IIS restriction endonuclease
(MmeI) and incorporates other modifications to produce longer
transcript tags. The resulting 21 bp tag consists of a constant
4 bp sequence representing the restriction site at which the transcript
was cleaved, followed by a unique 17 bp sequence derived from an
adjacent sequence in each transcript. Theoretical calculations show
that >99.8% of 21 bp tags are expected to occur only once in
genomes the size of the human genome (Table 1A). Likewise, similar
analyses based on actual sequence information from ~16,000 known
genes suggest that >75% of 21 bp tags would be expected to occur
only once in the human genome, with the remaining tags matching
duplicated genes or repeated sequences (as discussed below). In contrast,
conventional SAGE tags of 14 bp do not allow unique assignment
of tags to genomic sequences, though they do allow such
assignment to the much less complex compendium of expressed
sequence tags (ESTs) and previously characterized mRNAs (refs 10-12). To
optimize the quantification of transcripts, tags are ligated together to
form "ditags," which are then concatenated and cloned. Sequencing
tag concatemers in parallel allows the identification of up to ~30 tag
sequences in each sequencing reaction. Matching tags to genome